Skip to content

feat: area pre-rtree lookup table#339

Merged
jfberry merged 88 commits intomainfrom
len-optmize-match-stats
Mar 20, 2026
Merged

feat: area pre-rtree lookup table#339
jfberry merged 88 commits intomainfrom
len-optmize-match-stats

Conversation

@lenisko
Copy link
Copy Markdown
Contributor

@lenisko lenisko commented Mar 8, 2026

Create a cellIds>areaName mapping for all geofence areas we load (excluding cells on fence edges).
On each lookup try to use cellId to get area names and fallback to rtree when not found.

At example, I used 10MB geofence file, lookup table is taking 616MB and CPU usage goes ~2.5 times down. Startup is delayed by ~15 seconds to pregenerate all cellids.

To test make changes to config

[tuning]
s2_cell_lookup = true

Test output (interesting part)

    geography_test.go:41: Built S2 lookup with 3843015 cells, size: 616.86 MB
    geography_test.go:86: Results: 50000 points tested
    geography_test.go:87: S2 lookup hits: 33864 (67.73%)
    geography_test.go:88: Rtree fallback: 16136 (32.27%)
    geography_test.go:89: Mismatches: 0 (0.00%)
    geography_test.go:93: Note: S2 lookup may return fewer areas for edge cells - this is expected
    geography_test.go:95: Timing: rtree only: 18.393081479s (367.86 µs/call)
    geography_test.go:96: Timing: lookup+fallback: 2.691633585s (53.83 µs/call)
    geography_test.go:97: Speedup: 6.83x
--- PASS: TestMatchStatsGeofenceWithCellVsRtree (33.54s)

jfberry and others added 30 commits January 25, 2026 16:15
Instead of spawning a goroutine for each wild pokemon that sleeps
waiting for an encounter, use a single pending queue with a background
sweeper. This reduces memory pressure and goroutine overhead under
high load.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolved conflicts by keeping typed webhook structs from json_structure
while using dirty flag pattern (oldValues, newRecord) for change detection.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@lenisko lenisko requested a review from jfberry March 9, 2026 01:27
@jfberry
Copy link
Copy Markdown
Collaborator

jfberry commented Mar 9, 2026

Does this cope with geofence reloads?

@lenisko
Copy link
Copy Markdown
Contributor Author

lenisko commented Mar 9, 2026

Thanks, that's a good point. I managed to forget we have reload on geofence - It wouldn't crash, but there was a race.

Fixed with last commit.

@lenisko lenisko changed the base branch from optimise-coalesce to main March 9, 2026 09:55
@lenisko
Copy link
Copy Markdown
Contributor Author

lenisko commented Mar 16, 2026

p-1773658919-firefox

Along with other improvements and GC changes, CPU went down almost 3 times :)

@jfberry
Copy link
Copy Markdown
Collaborator

jfberry commented Mar 18, 2026

Review feedback

1. nestTree should also use atomic.Value for consistency

statsTree was changed from *rtree.RTreeG to atomic.Value to support concurrent access during geofence reload. nestTree has the same access pattern but is still a bare pointer. Both should use atomic.Value or neither should.

Action: Change nestTree to atomic.Value and update MatchNestGeofence to load from it the same way MatchStatsGeofenceWithCell loads from statsTree.

2. Move build-phase mutex out of S2CellLookup struct

S2CellLookup.mu is only used during BuildS2LookupFromFeatures to protect concurrent writes from worker goroutines. After build, the map is immutable and Lookup() reads without locking. The mutex should not live in the struct.

Action: Remove mu sync.Mutex from S2CellLookup. Instead, use a local sync.Mutex inside BuildS2LookupFromFeatures and pass it to addArea/addEdgeCell (or make them unexported helper closures within the build function). This makes the immutability contract explicit — the struct returned from build has no synchronization because it needs none.

3. Config comment should mention memory cost

The 10MB geofence → 616MB lookup table is a 60x memory blowup. The config comment just says "Use S2 cell lookup before rtree" which doesn't hint at the tradeoff.

Action: Update the config comment and example to mention the memory cost, e.g.:

s2_cell_lookup = false      # Pre-compute S2 cell lookup for faster geofence matching. Trades memory (~60x geofence file size) for ~2-7x faster lookups. (default: false)

4. Test improvements

The test has //go:build ignore and requires a local geofence file, so it can never run in CI. Also, the comment "S2 lookup may return fewer areas for edge cells - this is expected" is misleading — the design correctly removes edge cells entirely so the rtree fallback always gives the full answer. Mismatches should always be 0.

Action:

  • Change the mismatch comment to: "Mismatches should be 0 — edge cells are excluded from the S2 lookup, so the rtree fallback always provides the complete result."
  • Assert mismatches == 0 with t.Errorf so the test actually fails if the invariant breaks.
  • Consider adding a small unit test (without //go:build ignore) that constructs a synthetic geofence in-memory and verifies S2 lookup + fallback matches rtree for points inside, on edges, and outside.

@jfberry
Copy link
Copy Markdown
Collaborator

jfberry commented Mar 18, 2026

(I asked claude to review in a form that you could feed straight to your AI :-)

@lenisko
Copy link
Copy Markdown
Contributor Author

lenisko commented Mar 18, 2026

@jfberry :) done

@jfberry jfberry merged commit 26686d1 into main Mar 20, 2026
4 checks passed
@Fabio1988 Fabio1988 deleted the len-optmize-match-stats branch March 20, 2026 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants